Robust distant speech recognition based on position dependent CMN using a novel multiple microphone processing technique
Abstract
In a distant environment, channel distortion may drastically degrade speech recognition performance. In this paper, we propose a robust multiple-microphone speech processing approach based on position-dependent Cepstral Mean Normalization (CMN). In the training stage, the system measures the transmission characteristics for speaker positions at a set of grid points in the room and estimates the compensation parameters a priori. In the recognition stage, the system estimates the speaker position, selects the compensation parameters corresponding to that position, applies CMN to the speech, and performs speech recognition separately for each microphone. Finally, the result is chosen either by majority vote or by the maximum summed likelihood over all channels (that is, all microphones). For convenience, the compensation parameters are estimated from utterances emitted by a loudspeaker placed at various positions, and we also compensate for the mismatch between the cepstral means of utterances spoken by humans and those emitted by the loudspeaker. Our experiments showed that the proposed method improved speech recognition performance in a distant environment and compensated well for the mismatch between human and loudspeaker voices.
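The recognition-stage steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the nearest-grid-point lookup, the shape of `stored_means`, and the per-channel score dictionaries are all assumptions made for illustration. The abstract does not specify how the compensation parameters are stored or how the position estimate is mapped to them, and the human/loudspeaker mismatch compensation is omitted here.

```python
import numpy as np
from collections import Counter


def nearest_grid_point(position, grid_points):
    """Index of the grid point closest to the estimated speaker position."""
    diffs = np.asarray(grid_points, dtype=float) - np.asarray(position, dtype=float)
    return int(np.argmin(np.linalg.norm(diffs, axis=1)))


def position_dependent_cmn(cepstra, position, grid_points, stored_means):
    """Subtract the cepstral mean estimated a priori for the nearest grid point.

    cepstra:      (frames, dims) cepstral features of one microphone channel
    stored_means: (n_grid_points, dims) compensation parameters (assumed layout)
    """
    idx = nearest_grid_point(position, grid_points)
    return np.asarray(cepstra, dtype=float) - np.asarray(stored_means, dtype=float)[idx]


def combine_by_vote(channel_hypotheses):
    """Pick the hypothesis returned by the largest number of microphones."""
    return Counter(channel_hypotheses).most_common(1)[0][0]


def combine_by_summed_likelihood(channel_scores):
    """Pick the hypothesis with the largest log-likelihood summed over channels.

    channel_scores: one dict per microphone mapping hypothesis -> log-likelihood.
    Assumes every channel scores the same candidate set.
    """
    totals = {}
    for scores in channel_scores:
        for hypothesis, log_likelihood in scores.items():
            totals[hypothesis] = totals.get(hypothesis, 0.0) + log_likelihood
    return max(totals, key=totals.get)
```

In this sketch each microphone channel would be normalized with `position_dependent_cmn`, decoded by a recognizer (not shown), and the per-channel outputs passed to one of the two combination functions.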
Similar resources
Robust Distant Speech Recognition by Combining Multiple Microphone-Array Processing with Position-Dependent CMN
We propose robust distant speech recognition by combining multiple microphone-array processing with position-dependent cepstral mean normalization (CMN). In the recognition stage, the system estimates the speaker position and adopts compensation parameters estimated a priori corresponding to the estimated position. Then the system applies CMN to the speech (i.e., position-dependent CMN) and perf...
Robust distant speaker recognition based on position dependent cepstral mean normalization
In a distant environment, channel distortion may drastically degrade speaker recognition performance. In this paper, we propose a robust speaker recognition method based on position dependent Cepstral Mean Normalization (CMN) to compensate the channel distortion depending on the speaker position. It is shown in [1] that the position dependent CMN is robust for speech recognition in a distant en...
Robust Speech Recognition in Distant Environment Based on Speaker Position and Speaking Direction Detection
In a practical environment, channel distortion may severely degrade speech recognition performance. In this paper, we propose a robust speech recognition method using real-time Cepstral Mean Normalization (CMN) [1] based on speaker position and speaking direction detection. We first estimate the speaker position in a 3-D space based on the time delay of arrival (TDOA) between distinct microphone...
Robust distant speech recognition based on position dependent CMN
In a distant environment, channel distortion may dramatically degrade speech recognition performance. In this paper, we propose a robust speech recognition method based on position dependent Cepstral Mean Normalization (CMN). At first the system measures the transmission characteristics according to the speaker positions from some grid points in the room a priori. In the recognition stage, the ...
Robust distant speaker recognition based on position-dependent CMN by combining speaker-specific GMM with speaker-adapted HMM
In this paper, we propose a robust speaker recognition method based on position-dependent Cepstral Mean Normalization (CMN) to compensate for the channel distortion depending on the speaker position. In the training stage, the system measures the transmission characteristics according to the speaker positions from some grid points to the microphone in the room and estimates the compensation par...